Take-home Exercise 1: Water Points in Osun, Nigeria

Published

January 30, 2023

Modified

February 12, 2023

1 Overview

This exercise looks into the accessibility of water in Osun, a state in southwestern Nigeria. Specifically, we analyse how the water points are distributed across Osun and whether their functionality is related to their locations.

1.1 Dataset Sources

  • WPdx+ Dataset (CSV) - Taken from WPdx Global Data Repositories. It provides the locations of the water points.

  • State boundary GIS Datasets of Nigeria - Taken from Humanitarian Data Exchange. It gives the geospatial data of Nigeria, in particular the boundaries of its states and Local Government Areas (LGA).

2 Import R Packages

These are the R packages we will be using:

pacman::p_load(sf, tidyverse, funModeling, tmap, spatstat, maptools, raster, sfdep)

3 Import Datasets

3.1 WPdx+ Dataset

Since we are only analysing Osun’s water points, we will directly filter the water points by the country and state.

wp_osun <- read_csv("data/aspatial/Water_Point_Data_Exchange_-_Plus__WPdx__.csv") %>%
  filter(`#clean_country_name` == "Nigeria" &
           `#clean_adm1` == "Osun")
head(wp_osun, n=5)
# A tibble: 5 × 70
  row_id #sour…¹ #lat_…² #lon_…³ #repo…⁴ #stat…⁵ #wate…⁶ #wate…⁷ #wate…⁸ #wate…⁹
   <dbl> <chr>     <dbl>   <dbl> <chr>   <chr>   <chr>   <chr>   <chr>   <chr>  
1 429123 GRID3      8.02    5.06 08/29/… Unknown <NA>    <NA>    Tapsta… Tapsta…
2  70566 Federa…    7.32    4.79 05/11/… No      Protec… Well    Mechan… Mechan…
3  70578 Federa…    7.76    4.56 05/11/… No      Boreho… Well    Mechan… Mechan…
4  66401 Federa…    8.03    4.64 04/30/… No      Boreho… Well    Mechan… Mechan…
5 422190 GRID3      7.87    4.88 08/29/… Unknown <NA>    <NA>    Tapsta… Tapsta…
# … with 60 more variables: `#facility_type` <chr>,
#   `#clean_country_name` <chr>, `#clean_adm1` <chr>, `#clean_adm2` <chr>,
#   `#clean_adm3` <chr>, `#clean_adm4` <chr>, `#install_year` <dbl>,
#   `#installer` <chr>, `#rehab_year` <lgl>, `#rehabilitator` <lgl>,
#   `#management_clean` <chr>, `#status_clean` <chr>, `#pay` <chr>,
#   `#fecal_coliform_presence` <chr>, `#fecal_coliform_value` <dbl>,
#   `#subjective_quality` <chr>, `#activity_id` <chr>, `#scheme_id` <chr>, …

3.2 Nigeria Osun State

As the geospatial data of Nigeria is imported as a Simple Feature DataFrame, we want to ensure that it uses one of Nigeria’s projected coordinate systems (EPSG codes 26391, 26392 or 26393). Here we use EPSG:26392 (Minna / Nigeria Mid Belt).

NGA <- st_read(dsn = "data/geospatial/nga_adm_osgof_20190417",
               layer = "nga_admbnda_adm2_osgof_20190417") %>%
  st_transform(crs = 26392)
Reading layer `nga_admbnda_adm2_osgof_20190417' from data source 
  `C:\deadline2359\IS415-GAA\Take-home_Ex\Take-home_Ex01\data\geospatial\nga_adm_osgof_20190417' 
  using driver `ESRI Shapefile'
Simple feature collection with 774 features and 16 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 2.668534 ymin: 4.273007 xmax: 14.67882 ymax: 13.89442
Geodetic CRS:  WGS 84
head(NGA, n=5)
Simple feature collection with 5 features and 16 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 481088 ymin: 98142.39 xmax: 1248985 ymax: 1079710
Projected CRS: Minna / Nigeria Mid Belt
  Shape_Leng  Shape_Area   ADM2_EN ADM2_PCODE  ADM2_REF ADM2ALT1EN ADM2ALT2EN
1  0.2370744 0.001523921 Aba North   NG001001 Aba North       <NA>       <NA>
2  0.2624772 0.003531104 Aba South   NG001002 Aba South       <NA>       <NA>
3  3.0753158 0.326867840    Abadam   NG008001    Abadam       <NA>       <NA>
4  2.5379842 0.068378506     Abaji   NG015001     Abaji       <NA>       <NA>
5  0.6871498 0.014528691      Abak   NG003001      Abak       <NA>       <NA>
                    ADM1_EN ADM1_PCODE ADM0_EN ADM0_PCODE       date    validOn
1                      Abia      NG001 Nigeria         NG 2016-11-29 2019-04-17
2                      Abia      NG001 Nigeria         NG 2016-11-29 2019-04-17
3                     Borno      NG008 Nigeria         NG 2016-11-29 2019-04-17
4 Federal Capital Territory      NG015 Nigeria         NG 2016-11-29 2019-04-17
5                 Akwa Ibom      NG003 Nigeria         NG 2016-11-29 2019-04-17
  validTo                     SD_EN SD_PCODE                       geometry
1    <NA>                Abia South  NG00103 MULTIPOLYGON (((548795.5 11...
2    <NA>                Abia South  NG00103 MULTIPOLYGON (((547286.1 11...
3    <NA>               Borno North  NG00802 MULTIPOLYGON (((1248985 104...
4    <NA> Federal Capital Territory  NG01501 MULTIPOLYGON (((510864.9 57...
5    <NA>      Akwa Ibom North West  NG00302 MULTIPOLYGON (((594269 1209...

4 Data Handling

4.1 WPdx+ Dataset

Here, st_as_sfc() converts the “New Georeferenced Column” field of the WPdx+ dataset, which holds the water points’ locations, into a Simple Feature geometry column.

wp_osun$Geometry = st_as_sfc(wp_osun$`New Georeferenced Column`)
head(wp_osun, 5)
# A tibble: 5 × 71
  row_id #sour…¹ #lat_…² #lon_…³ #repo…⁴ #stat…⁵ #wate…⁶ #wate…⁷ #wate…⁸ #wate…⁹
   <dbl> <chr>     <dbl>   <dbl> <chr>   <chr>   <chr>   <chr>   <chr>   <chr>  
1 429123 GRID3      8.02    5.06 08/29/… Unknown <NA>    <NA>    Tapsta… Tapsta…
2  70566 Federa…    7.32    4.79 05/11/… No      Protec… Well    Mechan… Mechan…
3  70578 Federa…    7.76    4.56 05/11/… No      Boreho… Well    Mechan… Mechan…
4  66401 Federa…    8.03    4.64 04/30/… No      Boreho… Well    Mechan… Mechan…
5 422190 GRID3      7.87    4.88 08/29/… Unknown <NA>    <NA>    Tapsta… Tapsta…
# … with 61 more variables: `#facility_type` <chr>,
#   `#clean_country_name` <chr>, `#clean_adm1` <chr>, `#clean_adm2` <chr>,
#   `#clean_adm3` <chr>, `#clean_adm4` <chr>, `#install_year` <dbl>,
#   `#installer` <chr>, `#rehab_year` <lgl>, `#rehabilitator` <lgl>,
#   `#management_clean` <chr>, `#status_clean` <chr>, `#pay` <chr>,
#   `#fecal_coliform_presence` <chr>, `#fecal_coliform_value` <dbl>,
#   `#subjective_quality` <chr>, `#activity_id` <chr>, `#scheme_id` <chr>, …

4.1.0.1 Create Simple Feature DataFrame

st_sf() then converts wp_osun from a tibble to a Simple Feature DataFrame.

wp_sf <- st_sf(wp_osun, crs=4326)
head(wp_sf, 5)
Simple feature collection with 5 features and 70 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 4.563998 ymin: 7.317741 xmax: 5.06 ymax: 8.031187
Geodetic CRS:  WGS 84
# A tibble: 5 × 71
  row_id #sour…¹ #lat_…² #lon_…³ #repo…⁴ #stat…⁵ #wate…⁶ #wate…⁷ #wate…⁸ #wate…⁹
   <dbl> <chr>     <dbl>   <dbl> <chr>   <chr>   <chr>   <chr>   <chr>   <chr>  
1 429123 GRID3      8.02    5.06 08/29/… Unknown <NA>    <NA>    Tapsta… Tapsta…
2  70566 Federa…    7.32    4.79 05/11/… No      Protec… Well    Mechan… Mechan…
3  70578 Federa…    7.76    4.56 05/11/… No      Boreho… Well    Mechan… Mechan…
4  66401 Federa…    8.03    4.64 04/30/… No      Boreho… Well    Mechan… Mechan…
5 422190 GRID3      7.87    4.88 08/29/… Unknown <NA>    <NA>    Tapsta… Tapsta…
# … with 61 more variables: `#facility_type` <chr>,
#   `#clean_country_name` <chr>, `#clean_adm1` <chr>, `#clean_adm2` <chr>,
#   `#clean_adm3` <chr>, `#clean_adm4` <chr>, `#install_year` <dbl>,
#   `#installer` <chr>, `#rehab_year` <lgl>, `#rehabilitator` <lgl>,
#   `#management_clean` <chr>, `#status_clean` <chr>, `#pay` <chr>,
#   `#fecal_coliform_presence` <chr>, `#fecal_coliform_value` <dbl>,
#   `#subjective_quality` <chr>, `#activity_id` <chr>, `#scheme_id` <chr>, …

4.1.0.2 Re-Projection

As with the Nigeria boundary data, st_transform() is used to re-project the water points from a geographic coordinate system to a projected coordinate system, since a projected coordinate system is better suited to analysis involving distance and area measurements.

wp_sf <- wp_sf %>%
  st_transform(crs = 26392)
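To see what the re-projection changes, here is a minimal sketch that builds a single point from the rounded coordinates in the preview above (longitude 4.56, latitude 7.76) and transforms it. It assumes the sf package is installed with PROJ support; pt_wgs84 and pt_minna are illustrative names that are not part of the exercise.

```r
library(sf)

# One water point, rounded from the preview output above (lon, lat in degrees)
pt_wgs84 <- st_sfc(st_point(c(4.56, 7.76)), crs = 4326)

# Re-project to Minna / Nigeria Mid Belt (EPSG:26392)
pt_minna <- st_transform(pt_wgs84, crs = 26392)

st_is_longlat(pt_wgs84)   # TRUE: coordinates are angles
st_is_longlat(pt_minna)   # FALSE: coordinates are planar
st_coordinates(pt_minna)  # easting and northing in the projected system
```

Distances computed on the projected coordinates are in the linear units of the projected system, which is what the later density and distance-based analyses rely on.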

5 Geospatial Data Cleaning

5.1 Filtering Redundant Fields

Looking at Nigeria’s state boundary dataset, there are many fields and rows that are not necessary for our project. Hence, we will filter for the rows of Osun and select only the “ADM2_EN”, “ADM2_PCODE”, “ADM1_EN” and “ADM1_PCODE” fields, which hold the names and codes of the 2nd and 1st level administrative zones. Including “ADM2_EN” allows us to check whether any LGA is duplicated, which would affect map generation.

NGA <- NGA  %>%
  filter(`ADM1_EN` == "Osun") %>%
  dplyr::select(c(3:4, 8:9))
head(NGA, 5)
Simple feature collection with 5 features and 4 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 189625.6 ymin: 338755.8 xmax: 272238.5 ymax: 447175.5
Projected CRS: Minna / Nigeria Mid Belt
         ADM2_EN ADM2_PCODE ADM1_EN ADM1_PCODE                       geometry
1       Aiyedade   NG030001    Osun      NG030 MULTIPOLYGON (((213526.6 34...
2       Aiyedire   NG030002    Osun      NG030 MULTIPOLYGON (((212542.6 40...
3 Atakumosa East   NG030003    Osun      NG030 MULTIPOLYGON (((265746.8 37...
4 Atakumosa West   NG030004    Osun      NG030 MULTIPOLYGON (((248871.4 40...
5     Boluwaduro   NG030005    Osun      NG030 MULTIPOLYGON (((266092.2 43...

5.2 Checking for Duplicated Name

You can see that there is no duplicated LGA in the Osun state.

NGA$ADM2_EN[duplicated(NGA$ADM2_EN) == TRUE]
character(0)

5.3 Excluding Unnecessary Data Points

st_intersection() is used to keep only the water points whose locations fall within Osun state’s boundaries. If we used wp_sf as it is, we might include water points that are not actually in Osun due to data errors.

wp_sf <- st_intersection(NGA, wp_sf)

6 Data Wrangling for Water Point Data

As shown in the frequency chart and table below, some rows have no status recorded (“NA”). The rest can roughly be split into two categories: functional water points and non-functional water points.

funModeling::freq(data = wp_sf,
     input = 'X.status_clean')

                    X.status_clean frequency percentage cumulative_perc
1                       Functional      2232      41.94           41.94
2                   Non-Functional      1894      35.59           77.53
3                             <NA>       734      13.79           91.32
4      Functional but needs repair       236       4.43           95.75
5 Non-Functional due to dry season       146       2.74           98.49
6        Functional but not in use        61       1.15           99.64
7                        Abandoned        15       0.28           99.92
8         Abandoned/Decommissioned         4       0.08          100.00

We do not want “NA” to appear as it is in our final chart, as it would not be very readable for our audience. Hence, “NA” is renamed to “unknown”.

wp_sf_nga <- wp_sf %>%
  rename(status_clean = 'X.status_clean') %>%
  dplyr::select(status_clean) %>%
  mutate(status_clean = replace_na(
    status_clean, "unknown"
  ))

funModeling::freq(data = wp_sf_nga,
     input = 'status_clean')

                      status_clean frequency percentage cumulative_perc
1                       Functional      2232      41.94           41.94
2                   Non-Functional      1894      35.59           77.53
3                          unknown       734      13.79           91.32
4      Functional but needs repair       236       4.43           95.75
5 Non-Functional due to dry season       146       2.74           98.49
6        Functional but not in use        61       1.15           99.64
7                        Abandoned        15       0.28           99.92
8         Abandoned/Decommissioned         4       0.08          100.00

6.1 Extract Water Point Data

In the code chunk below, we split the water points into functional, non-functional and unknown groups so that further analysis can be done on each group.

wp_functional <- wp_sf_nga %>%
  filter(status_clean %in%
           c("Functional",
             "Functional but not in use",
             "Functional but needs repair"))
wp_nonfunctional <- wp_sf_nga %>%
  filter(status_clean %in%
           c("Abandoned/Decommissioned",
             "Abandoned",
             "Non-Functional due to dry season",
             "Non-Functional",
             "Non functional due to dry season"))
wp_unknown <- wp_sf_nga %>%
  filter(status_clean == "unknown")
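The same grouping can also be expressed as a single lookup from detailed status to coarse class. The following is a minimal base R sketch under that assumption; status_class, status and functionality are hypothetical names, and the toy input only stands in for wp_sf_nga$status_clean.

```r
# Hypothetical lookup from detailed status to a coarse class
status_class <- c(
  "Functional"                       = "functional",
  "Functional but not in use"        = "functional",
  "Functional but needs repair"      = "functional",
  "Non-Functional"                   = "nonfunctional",
  "Non-Functional due to dry season" = "nonfunctional",
  "Abandoned"                        = "nonfunctional",
  "Abandoned/Decommissioned"         = "nonfunctional",
  "unknown"                          = "unknown"
)

# Toy input standing in for wp_sf_nga$status_clean
status <- c("Functional", "Abandoned", "unknown", "Functional but needs repair")

# Index the lookup by name; unname() drops the names carried over from it
functionality <- unname(status_class[status])
table(functionality)
```

Keeping the mapping in one place makes it easier to check that every status value seen in the frequency table is assigned to exactly one group.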

6.2 Performing Point-in-Polygon Count

Using st_intersects(), we identify which water points fall within each LGA polygon of Osun. Taking lengths() of the result gives NGA_wp the number of functional, non-functional, unknown and total water points in each 2nd-level administrative zone (LGA).

NGA_wp <- NGA %>%
  mutate('total_wp' = lengths(
    st_intersects(NGA, wp_sf_nga))) %>%
  mutate('wp_functional' = lengths(
    st_intersects(NGA, wp_functional))) %>%
  mutate('wp_nonfunctional' = lengths(
    st_intersects(NGA, wp_nonfunctional))) %>%
  mutate('wp_unknown' = lengths(
    st_intersects(NGA, wp_unknown)))
head(NGA_wp, 5)
Simple feature collection with 5 features and 8 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 189625.6 ymin: 338755.8 xmax: 272238.5 ymax: 447175.5
Projected CRS: Minna / Nigeria Mid Belt
         ADM2_EN ADM2_PCODE ADM1_EN ADM1_PCODE                       geometry
1       Aiyedade   NG030001    Osun      NG030 MULTIPOLYGON (((213526.6 34...
2       Aiyedire   NG030002    Osun      NG030 MULTIPOLYGON (((212542.6 40...
3 Atakumosa East   NG030003    Osun      NG030 MULTIPOLYGON (((265746.8 37...
4 Atakumosa West   NG030004    Osun      NG030 MULTIPOLYGON (((248871.4 40...
5     Boluwaduro   NG030005    Osun      NG030 MULTIPOLYGON (((266092.2 43...
  total_wp wp_functional wp_nonfunctional wp_unknown
1      389           157              154         78
2      175            89               57         29
3      223            98               92         33
4      246           111              103         32
5      129            63               51         15
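The counting pattern above can be checked on a toy example. This sketch assumes the sf package is installed; the two unit squares and three points are made up for illustration, and sq(), zones and pts are hypothetical names.

```r
library(sf)

# Helper that builds a unit square with its lower-left corner at (x0, y0)
sq <- function(x0, y0) {
  st_polygon(list(rbind(c(x0, y0), c(x0 + 1, y0), c(x0 + 1, y0 + 1),
                        c(x0, y0 + 1), c(x0, y0))))
}

# Two adjacent zones and three points (two in zone A, one in zone B)
zones <- st_sf(name = c("A", "B"), geometry = st_sfc(sq(0, 0), sq(1, 0)))
pts   <- st_sf(geometry = st_sfc(st_point(c(0.2, 0.5)),
                                 st_point(c(0.7, 0.1)),
                                 st_point(c(1.5, 0.5))))

# st_intersects() returns, for each zone, the indices of the points inside it;
# lengths() turns that list into one count per zone
zones$n_points <- lengths(st_intersects(zones, pts))
zones$n_points   # 2 1
```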

From the charts below, we can see that most LGAs have around 100 to 150 water points. Looking closer at each group, most LGAs tend to have slightly more functional than non-functional water points.

ggplot(data = NGA_wp,
       aes(x = total_wp)) + 
  geom_histogram(bins=20,
                 color="black",
                 fill="light blue") +
  geom_vline(aes(xintercept=mean(
    total_wp, na.rm=T)),
             color="red", 
             linetype="dashed", 
             size=0.8) +
  ggtitle("Distribution of total water points by LGA") +
  xlab("No. of water points") +
  ylab("No. of\nLGAs") +
  theme(axis.title.y=element_text(angle = 0))

ggplot(data = NGA_wp,
       aes(x = wp_functional)) + 
  geom_histogram(bins=20,
                 color="black",
                 fill="dark blue") +
  geom_vline(aes(xintercept=mean(
    wp_functional, na.rm=T)),
             color="red", 
             linetype="dashed", 
             size=0.8) +
  ggtitle("Distribution of functional water points by LGA") +
  xlab("No. of water points") +
  ylab("No. of\nLGAs") +
  theme(axis.title.y=element_text(angle = 0))

ggplot(data = NGA_wp,
       aes(x = wp_nonfunctional)) + 
  geom_histogram(bins=20,
                 color="black",
                 fill="orange") +
  geom_vline(aes(xintercept=mean(
    wp_nonfunctional, na.rm=T)),
             color="red", 
             linetype="dashed", 
             size=0.8) +
  ggtitle("Distribution of non-functional water points by LGA") +
  xlab("No. of water points") +
  ylab("No. of\nLGAs") +
  theme(axis.title.y=element_text(angle = 0))

ggplot(data = NGA_wp,
       aes(x = wp_unknown)) + 
  geom_histogram(bins=20,
                 color="black",
                 fill="grey") +
  geom_vline(aes(xintercept=mean(
    wp_unknown, na.rm=T)),
             color="red", 
             linetype="dashed", 
             size=0.8) +
  ggtitle("Distribution of unknown water points by LGA") +
  xlab("No. of water points") +
  ylab("No. of\nLGAs") +
  theme(axis.title.y=element_text(angle = 0))

7 Map Visualisations

wp_functional["functionality"] = "functional"
wp_nonfunctional["functionality"] = "nonfunctional"
wp_nonunknown <- rbind(wp_functional, wp_nonfunctional)
head(wp_nonunknown, 5)
Simple feature collection with 5 features and 2 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 212202 ymin: 349210.1 xmax: 270497.9 ymax: 403822.5
Projected CRS: Minna / Nigeria Mid Belt
                   status_clean functionality                  geometry
1   Functional but needs repair    functional   POINT (212810 386707.6)
8                    Functional    functional POINT (228798.9 403822.5)
28                   Functional    functional POINT (270497.9 377476.9)
1.1                  Functional    functional   POINT (212202 349210.1)
18                   Functional    functional POINT (259331.9 399591.4)

Creating a simple interactive map, we can easily see that most of the water points are located towards the north of Osun, leaving the south with few. This is despite the south being closer to the sea, as can be seen by zooming out a little.

tmap_mode("view")
tm_shape(wp_nonunknown) +
tm_dots(col = "functionality",
        palette = c("functional" = "darkblue", "nonfunctional" = "orange"),
        title = "Functionality") +
  tm_view(set.zoom.limits = c(5,25),
          set.view = 9) 
tmap_mode("plot")

Looking at the functional and non-functional water points separately, the point maps below reflect the same trend as the above map where the water points are largely located towards the north.

wp_functional_map <-  tm_shape(NGA) + 
  tm_fill() + 
  tm_borders() + 
  tm_shape(wp_functional) +
  tm_dots(col = "status_clean",
         title = "Functional",
         palette = "darkblue",
         legend.show = FALSE) +
  tm_layout(title="Functional") +
  tm_view(set.view = 9) 
wp_nonfunctional_map <- tm_shape(NGA) + 
  tm_fill() + 
  tm_borders() + 
  tm_shape(wp_nonfunctional) +
  tm_dots(col = "status_clean",
         title = "Non-Functional",
         palette = "orange",
         legend.show = FALSE) +
  tm_layout(title="Non-Functional") +
  tm_view(set.view = 9) 


tmap_arrange(wp_functional_map, wp_nonfunctional_map,
             asp = 1,
             ncol = 2)

8 First-order Spatial Point Patterns Analysis

Using first-order spatial point patterns analysis, we hope to observe how density varies across the study area, which in our case means how the water points are spread across the state.

To proceed with producing a raster map, we will be making use of spatstat.

As spatstat needs its inputs to be of ppp object type, we need to convert the Simple Feature DataFrame to ppp, which will need to go through a few conversions.

8.1 Conversion of Datatypes

8.1.1 Converting sf data frames to sp’s Spatial* class

wp_functional_spatial <- as_Spatial(wp_functional)
wp_nonfunctional_spatial <- as_Spatial(wp_nonfunctional)
wp_nonunknown_spatial <- as_Spatial(wp_nonunknown)
NGA_spatial <- as_Spatial(NGA)

8.1.2 Converting sp’s Spatial* Class into Generic sp Format

wp_functional_sp <- as(wp_functional_spatial, "SpatialPoints")
wp_nonfunctional_sp <- as(wp_nonfunctional_spatial, "SpatialPoints")
wp_nonunknown_sp <- as(wp_nonunknown_spatial, "SpatialPoints")
NGA_sp <- as(NGA_spatial, "SpatialPolygons")

8.1.3 Converting Generic sp Format into spatstat’s ppp Format

wp_functional_ppp <- as(wp_functional_sp, "ppp")
wp_nonfunctional_ppp <- as(wp_nonfunctional_sp, "ppp")
wp_nonunknown_ppp <- as(wp_nonunknown_sp, "ppp")

The plots below show the water points as ppp objects: first the functional and non-functional points together, then each group separately. We can already start to see where the water points are concentrated.

plot(wp_nonunknown_ppp, main="Water Points \n(excluding missing data)")

plot(wp_functional_ppp, main="Functional Water Points")

plot(wp_nonfunctional_ppp, main="Non-Functional Water Points")

8.2 Check for Duplicate Data Points

Duplicated data points should be removed because the methods of spatial point patterns analysis largely assume that the point process is simple (i.e., no two points share the same location). The code chunk below checks whether any duplicated data points exist in our ppp objects.

any(duplicated(wp_functional_ppp))
[1] FALSE
any(duplicated(wp_nonfunctional_ppp))
[1] FALSE
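For readers unfamiliar with what counts as a duplicate here, the idea can be sketched in base R on plain coordinates. This is only an analogue of the check above, not what spatstat does internally; coords and dup are illustrative names.

```r
# Toy coordinates: the first and third points coincide
coords <- data.frame(x = c(1, 2, 1, 3), y = c(5, 6, 5, 7))

# duplicated() on a data frame compares whole rows, i.e. (x, y) pairs
dup <- duplicated(coords)
any(dup)         # TRUE

# Keep the first occurrence of each location
coords[!dup, ]
```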

8.3 Creating owin Object

An owin object will be created to assist us in confining the data points to only Osun.

NGA_owin <- as(NGA_sp, "owin")
wp_functional_ppp = wp_functional_ppp[NGA_owin]
wp_nonfunctional_ppp = wp_nonfunctional_ppp[NGA_owin]
wp_nonunknown_ppp = wp_nonunknown_ppp[NGA_owin]
plot(NGA_owin)

8.4 Kernel Density Estimation (KDE)

We rescale the ppp objects from meters to kilometers, since KDE values expressed per square meter would not be easily understood.

wp_functional_ppp.km <- rescale(wp_functional_ppp, 1000, "km")
wp_nonfunctional_ppp.km <- rescale(wp_nonfunctional_ppp, 1000, "km")
wp_nonunknown_ppp.km <- rescale(wp_nonunknown_ppp, 1000, "km")
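To make the effect of the rescaling concrete, the sketch below computes the average intensity of water points in both units. It uses the functional and non-functional counts from the frequency table and the approximate area of 8,500 km² quoted in this document; both are rough inputs chosen for illustration, and the variable names are hypothetical.

```r
# Functional and non-functional rows from the frequency table
n_points <- 2232 + 236 + 61 + 1894 + 146 + 15 + 4

area_km2 <- 8500               # approximate area of Osun used in this document
area_m2  <- area_km2 * 1000^2  # 1 km^2 = 1,000,000 m^2

intensity_per_m2  <- n_points / area_m2
intensity_per_km2 <- n_points / area_km2

intensity_per_m2    # roughly 5.4e-07 points per square meter
intensity_per_km2   # roughly 0.54 points per square kilometer
```

The two numbers describe the same pattern; the per square kilometer value is simply easier to read on a legend.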

8.4.1 Computing KDE using Automatic Bandwidth Selection Method

bw.ppl() is chosen because it highlights clusters more clearly than bw.diggle(), without producing the misleadingly large highlighted areas that result from using bw.CvL() or bw.scott().
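Why the bandwidth matters can be illustrated with a one-dimensional sketch in base R. This is not spatstat's implementation: it uses a hand-written Epanechnikov kernel whose bandwidth h is the kernel half-width, which is an assumption of this sketch and not necessarily the same parametrisation as sigma in density(). All names are illustrative.

```r
# Epanechnikov kernel with half-width h: zero outside [-h, h]
epanechnikov <- function(u, h) {
  ifelse(abs(u) <= h, 0.75 * (1 - (u / h)^2) / h, 0)
}

# Kernel density estimate of sample x, evaluated at each grid location
kde_1d <- function(grid, x, h) {
  sapply(grid, function(g) mean(epanechnikov(g - x, h)))
}

x    <- c(1.0, 1.2, 1.3, 4.0, 4.1)   # toy sample with two clusters
step <- 0.01
grid <- seq(-3, 8, by = step)

narrow <- kde_1d(grid, x, h = 0.5)   # small bandwidth: two sharp peaks
wide   <- kde_1d(grid, x, h = 3.0)   # large bandwidth: one broad hump

# Both curves integrate to about 1; only the smoothness differs
sum(narrow) * step
sum(wide) * step
max(narrow) > max(wide)
```

A bandwidth that is too large merges separate clusters into one highlighted area, while a small one keeps them distinct, which is the trade-off behind the choice of selector.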

8.4.1.1 All excluding unknown

wp_nonunknown_bw <- density(wp_nonunknown_ppp.km,
                              sigma = bw.ppl,
                              edge = TRUE,
                              kernel = "epanechnikov")
plot(wp_nonunknown_bw)

8.4.1.2 Functional

wp_functional_bw <- density(wp_functional_ppp.km,
                              sigma = bw.ppl,
                              edge = TRUE,
                              kernel = "epanechnikov")
plot(wp_functional_bw)

8.4.1.3 Non-Functional

wp_nonfunctional_bw <- density(wp_nonfunctional_ppp.km,
                              sigma = bw.ppl,
                              edge = TRUE,
                              kernel = "epanechnikov")
plot(wp_nonfunctional_bw)

8.4.2 Converting KDE Output into Grid Object

The raster maps below show more clearly where the water points are located. As mentioned above, most of the water points are located in the northern part of Osun. Glancing at the scales beside the maps, we notice that the highlighted surfaces span only around 25 km at their largest, in a state of around 8,500 km².

Areas with high density of functional water points seem to also show similar high density of non-functional water points. We will see if this is true further down.

8.4.2.1 All excluding unknown

gridded_wp_nonunknown_bw <- as.SpatialGridDataFrame.im(wp_nonunknown_bw)
spplot(gridded_wp_nonunknown_bw)

8.4.2.2 Functional Water Points

gridded_wp_functional_bw <- as.SpatialGridDataFrame.im(wp_functional_bw)
spplot(gridded_wp_functional_bw)

8.4.2.3 Non-Functional Water Points

gridded_wp_nonfunctional_bw <- as.SpatialGridDataFrame.im(wp_nonfunctional_bw)
spplot(gridded_wp_nonfunctional_bw)

8.4.3 Converting Gridded Output into Raster

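As we want to use tmap to display the raster maps, a CRS is assigned to the raster objects. Note that the raster coordinates are in kilometers, because the KDE was computed on the rescaled ppp objects, whereas EPSG:26392 is defined in meters.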

kde_wp_functional_bw_raster <- raster(gridded_wp_functional_bw)
projection(kde_wp_functional_bw_raster) <- CRS("+init=EPSG:26392")
kde_wp_functional_bw_raster
class      : RasterLayer 
dimensions : 128, 128, 16384  (nrow, ncol, ncell)
resolution : 0.8948485, 0.9616045  (x, y)
extent     : 176.5032, 291.0438, 331.4347, 454.5201  (xmin, xmax, ymin, ymax)
crs        : +init=EPSG:26392 
source     : memory
names      : v 
values     : -1.671786e-15, 9.409052  (min, max)
kde_wp_nonfunctional_bw_raster <- raster(gridded_wp_nonfunctional_bw)
projection(kde_wp_nonfunctional_bw_raster) <- CRS("+init=EPSG:26392")
kde_wp_nonfunctional_bw_raster
class      : RasterLayer 
dimensions : 128, 128, 16384  (nrow, ncol, ncell)
resolution : 0.8948485, 0.9616045  (x, y)
extent     : 176.5032, 291.0438, 331.4347, 454.5201  (xmin, xmax, ymin, ymax)
crs        : +init=EPSG:26392 
source     : memory
names      : v 
values     : -1.306349e-15, 8.406528  (min, max)

8.4.4 Visualising the Output in tmap

8.4.4.1 Functional

tmap_mode("view")
tm_basemap('OpenStreetMap') +
tm_shape(kde_wp_functional_bw_raster) +
  tm_raster("v") 
tmap_mode("plot")

8.4.4.2 Non-Functional

tmap_mode("view")
tm_basemap('OpenStreetMap') +
tm_shape(kde_wp_nonfunctional_bw_raster) +
  tm_raster("v") 
tmap_mode("plot")

8.5 Nearest Neighbour Analysis

Here, we use clarkevans.test() to test for aggregation in the spatial point pattern, at the 95% confidence level.

  • Ho = The water points are randomly distributed.

  • H1 = The water points are not randomly distributed.

With the p-value (0.01) lower than the alpha value of 0.05, we reject the null hypothesis and conclude that the water points are not randomly distributed. As R = 0.374 is well below 1, the pattern is clustered.

clarkevans.test(wp_nonunknown_ppp.km,
                correction = "none", 
                alternative = c("clustered"), 
                nsim = 99)

    Clark-Evans test
    No edge correction
    Monte Carlo test based on 99 simulations of CSR with fixed n

data:  wp_nonunknown_ppp.km
R = 0.374, p-value = 0.01
alternative hypothesis: clustered (R < 1)
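The R statistic reported above is the ratio of the observed mean nearest-neighbour distance to the value expected under complete spatial randomness, which is 1 / (2 * sqrt(intensity)). A minimal base R sketch of that ratio, with no edge correction and no Monte Carlo test, is given below; clark_evans_R and the two toy grids are hypothetical and only meant to show how the ratio behaves.

```r
# Clark-Evans ratio without edge correction
clark_evans_R <- function(x, y, area) {
  d <- as.matrix(dist(cbind(x, y)))
  diag(d) <- Inf                                   # ignore distance to itself
  observed <- mean(apply(d, 1, min))               # mean nearest-neighbour distance
  expected <- 1 / (2 * sqrt(length(x) / area))     # expectation under CSR
  observed / expected
}

g <- expand.grid(x = 0:4, y = 0:4)   # 25 points on a 5 x 5 lattice

# Same 25 points in a unit-square window, packed tightly or spread evenly
clustered <- clark_evans_R(0.5 + 0.01 * g$x, 0.5 + 0.01 * g$y, area = 1)
regular   <- clark_evans_R(0.1 + 0.2 * g$x, 0.1 + 0.2 * g$y, area = 1)

clustered   # about 0.1: well below 1, clustered
regular     # about 2: above 1, regular
```

Values below 1 indicate clustering and values above 1 indicate regularity, so the observed 0.374 sits clearly on the clustered side.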

8.6 Advantages of Kernel Density Map over Point Map

  • Visually, clusters are more visible in kernel density maps than in point maps, where concentrated points overlap each other. This is because KDE smooths the concentrations of points into a surface fitted over them, whereas point maps simply plot every point, so points with the same or similar coordinates are drawn on top of one another.
  • The kernel weights used to generate the kernel density maps decay with distance from each point, which shows up as deeper colours closer to the centre of clusters. In Osun, few clusters are observed and each cluster is rather small compared to the large landmass of the state. Residents who live closer to the centre of these clusters will likely need to travel less to reach water points than residents living at the edge of, or outside, these clusters.

9 Second-order Spatial Point Patterns Analysis

9.1 Analysing Spatial Point Process Using L-Function

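To conduct our second-order spatial point patterns analysis, we opt to use Besag’s L-Function. It is a normalised form of Ripley’s K-Function, which summarises the number of neighbouring points found within a given radius r of each point. Under complete spatial randomness L(r) equals r, so L(r) - r is zero; positive values suggest clustering and negative values suggest dispersion.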
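The relationship between the two functions can be sketched in base R. Under complete spatial randomness K(r) = pi * r^2, and the L-function is L(r) = sqrt(K(r) / pi). The estimator below is a naive version without edge correction, so it is only an illustration and not what Lest() computes with correction = "Ripley"; K_naive and L_from_K are hypothetical names.

```r
# Naive K estimate at a single radius r, without edge correction
K_naive <- function(x, y, r, area) {
  n <- length(x)
  d <- as.matrix(dist(cbind(x, y)))
  diag(d) <- Inf                        # ignore each point's distance to itself
  area * sum(d <= r) / (n * (n - 1))    # share of ordered pairs within r, times area
}

# Besag's transformation of K
L_from_K <- function(K) sqrt(K / pi)

# Under complete spatial randomness L(r) - r is zero
r <- 0.2
L_from_K(pi * r^2) - r

# Toy cluster: 25 points packed into a 0.04 x 0.04 patch of a unit-square window
g <- expand.grid(x = 0:4, y = 0:4)
K_obs <- K_naive(0.5 + 0.01 * g$x, 0.5 + 0.01 * g$y, r = 0.06, area = 1)
L_obs <- L_from_K(K_obs)
L_obs - 0.06   # positive: more close neighbours than randomness would give
```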

Using envelope() from the spatstat package, Monte Carlo simulations are run to test the randomness of the water points.

At the 95% confidence level, we set nsim to 39 and reject the null hypothesis if the p-value is smaller than the alpha value of 0.05. With rank = 1, a two-sided pointwise envelope built from 39 simulations corresponds to a significance level of 2 / (39 + 1) = 0.05.
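The arithmetic behind the choice of nsim can be sketched as follows. The formula is the one documented by spatstat for two-sided pointwise envelopes; envelope_alpha is a hypothetical helper, and the relationship differs for global envelopes.

```r
# Significance level of a two-sided pointwise envelope built from the
# nrank-th smallest and nrank-th largest of nsim simulated curves
envelope_alpha <- function(nsim, nrank = 1) 2 * nrank / (nsim + 1)

envelope_alpha(39)    # 0.05
envelope_alpha(99)    # 0.02
envelope_alpha(199)   # 0.01
```

More simulations allow a stricter significance level at the cost of longer computation.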

9.1.1 Functional Water Points

9.1.1.1 Computing L-function estimation

L_wp_functional = Lest(wp_functional_ppp.km, correction = "Ripley")
plot(L_wp_functional, . -r ~ r, 
     ylab= "L(d)-r", xlab = "d(km)", main="Functional Water Points")

9.1.1.2 Performing Complete Spatial Randomness Test

  • Ho = The functional water points in Osun are randomly distributed.

  • H1 = The functional water points in Osun are not randomly distributed.

As the L value (i.e., the line) lies above both the reference line and the upper confidence envelope, the null hypothesis is rejected and we conclude that functional water points are not randomly distributed.

L_functional.csr <- envelope(wp_functional_ppp, Lest, nsim = 39, rank = 1)
plot(L_functional.csr, . - r ~ r, xlab="d", ylab="L(d)-r", main="Functional Water Points")

Important

Because the ppp object in meters (wp_functional_ppp) was passed instead of the rescaled one (wp_functional_ppp.km), the scale of this graph is in meters rather than the intended kilometers used in the graph in 9.1.1.1 Computing L-function estimation.

9.1.2 Non-Functional Water Points

9.1.2.1 Computing L-function estimation

L_wp_nonfunctional = Lest(wp_nonfunctional_ppp.km, correction = "Ripley")
plot(L_wp_nonfunctional, . -r ~ r, 
     ylab= "L(d)-r", xlab = "d(km)", main="Non-Functional Water Points")

9.1.2.2 Performing Complete Spatial Randomness Test

  • Ho = The non-functional water points in Osun are randomly distributed.

  • H1 = The non-functional water points in Osun are not randomly distributed.

Similarly, we can observe that the line is well above the upper confidence envelope. We can safely reject the null hypothesis and conclude that the spatial clustering is significant.

L_nonfunctional.csr <- envelope(wp_nonfunctional_ppp, Lest, nsim = 39, rank = 1)
plot(L_nonfunctional.csr, . - r ~ r, xlab="d", ylab="L(d)-r", main="Non-Functional Water Points")

Important

As in 9.1.1.2 Performing Complete Spatial Randomness Test for functional water points, the scale is in meters instead of kilometers.

10 Spatial Correlation Analysis

In this section, we observe the density of functional water points surrounding a non-functional water point and vice versa. This will test the earlier observation that areas with high concentrations of functional water points also have higher concentrations of non-functional ones.

Using st_knn() and include_self(), we group each water point with its 6 nearest neighbours and itself, and then check whether functional and non-functional water points are co-located.

wp_neighbours <- include_self(
  st_knn(st_geometry(wp_nonunknown), 6) 
)

wp_weights <- st_kernel_weights(wp_neighbours,
                        wp_nonunknown,
                        "gaussian",
                        adaptive = TRUE)
LCLQ <- local_colocation(wp_functional$functionality, wp_nonfunctional$functionality, wp_neighbours, wp_weights, 39)
LCLQ_wp <- cbind(wp_nonunknown, LCLQ)
tmap_mode("view")
tm_shape(NGA) +
  tm_polygons() +
tm_shape(LCLQ_wp) +
  tm_dots(col = "nonfunctional",
          size = 0.01,
         pal = c("orange"),
         border.lwd = 0.5) +
  tm_view(set.zoom.limits = c(5,25),
          set.view = 9) 
tmap_mode("plot")

10.1 Conversion of Datatypes

As above, we want to use Lest() to test whether the positions of functional and non-functional water points are related.

10.1.1 Converting sf data frames to sp’s Spatial* class

LCLQ_wp_spatial <- as_Spatial(LCLQ_wp)

10.1.2 Converting sp’s Spatial* Class into Generic sp Format

LCLQ_wp_sp <- as(LCLQ_wp_spatial, "SpatialPoints")

10.1.3 Converting Generic sp Format into spatstat’s ppp Format

LCLQ_wp_ppp <- as(LCLQ_wp_sp, "ppp")

The plot below shows both functional and non-functional water points as a ppp object.

plot(LCLQ_wp_ppp)

LCLQ_wp_ppp.km <- rescale(LCLQ_wp_ppp, 1000, "km")

10.2 Analysing Spatial Point Process Using L-Function

L_LCLQ_wp = Lest(LCLQ_wp_ppp.km, correction = "Ripley")
plot(L_LCLQ_wp, . -r ~ r, 
     ylab= "L(d)-r", xlab = "d(km)", main="LCLQ of Water Points")

10.3 Performing Complete Spatial Randomness Test

  • Ho = The spatial distributions of functional and non-functional water points are independent of each other.

  • H1 = The spatial distributions of functional and non-functional water points are not independent of each other.

The L value is observed to lie above both the reference line and the upper confidence envelope. Hence the null hypothesis is rejected and we conclude that the spatial distributions of functional and non-functional water points are not independent of each other.

LCLQ_wp.csr <- envelope(LCLQ_wp_ppp.km, Lest, nsim = 39, rank = 1)
Generating 39 simulations of CSR  ...
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38,  39.

Done.
plot(LCLQ_wp.csr, . - r ~ r, xlab="d", ylab="L(d)-r")

11 Conclusion

We have rejected all our null hypotheses in understanding the water points in Osun.

  1. Neither functional nor non-functional water points in Osun are distributed randomly.

  2. In addition, the spatial locations of these two groups of water points are dependent on each other, suggesting that areas with a high number of functional water points also have a high number of non-functional water points. This means that water points in general cluster together.

    1. This also tells us that places outside these areas, though they have few non-functional water points, seem to have fewer functional water points as well. This is worrying in a state with around 4.8 million people, given that the raster maps show the concentrations of water points to be rather small.